Adiya Chopra 20BPS1163 Mugdha Kondhare 20BPS1095
The aim of analysing the Zomato dataset is to understand the factors that affect where different types of restaurants are established around the world, and the aggregate rating of each restaurant. New restaurants open every day, so the industry is not yet saturated and demand keeps growing. Despite this growing demand, new restaurants find it difficult to compete with established ones, since most of them serve the same food. Many people depend mainly on restaurant food because they do not have time to cook for themselves. With such demand, it has become important to study the demography of a location. This kind of analysis can be done with the data by studying factors such as:
• Location of the restaurant
• Approximate price of food
• Restaurants and the quality of their food as per public rating
• Which locality of a city has the most restaurants serving a given cuisine
• Whether a particular neighbourhood is famous for its own kind of food
Problems identified:
1. Identifying the strategies adopted by different restaurants is a laborious task.
2. Analysing how Zomato differs from its competitors.
3. Analysing the best location and cuisines for opening a restaurant in a given place.
4. Analysing which food is popular in which places.
5. Analysing relationships between restaurants, food and ratings.
To make these issues easier for the user to understand, they need to be addressed through proper visualisation.
The dataset contains the following features:
1. Restaurant Id: The Id of the restaurant on the Zomato website
2. Restaurant Name: The name of the restaurant
3. Country Code: Numeric code of the country
4. City: The city in which the restaurant is located
5. Address: The address of the restaurant
6. Locality: The neighbourhood in which the restaurant is located
7. Locality Verbose: The locality followed by the city name
8. Longitude: Longitude of the restaurant
9. Latitude: Latitude of the restaurant
10. Cuisines: Cuisines served, as a comma-separated list
11. Average Cost for Two: Average cost of a meal for two people, in the local currency
12. Currency: Currency of that country
13. Has Table Booking: Whether table booking is available (Yes/No)
14. Has Online Delivery: Whether online delivery is available (Yes/No)
15. Is Delivering Now: Whether food is currently being delivered (Yes/No)
16. Switch to Order Menu: Whether the order menu is available (Yes/No)
17. Price Range: Price category, from 1 to 4
18. Aggregate Rating: The overall rating of the restaurant (observed values range from 0 to 4.9)
19. Rating Colour: Colour code of the rating
20. Rating Text: Text label of the rating (for example "Excellent" or "Very Good")
21. Votes: Total number of votes for the restaurant
#packages
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.2.2
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.2.2
#packageVersion('rlang')
#remove.packages('rlang')
#install.packages('rlang')
Reading the CSV file into a variable.
#Reading the CSV file into a variable (the file name ends in .xls, but the content is comma-separated text, so read.csv() is used)
zomato <- read.csv('D:/VIT/CSE3505/zomato.csv.xls')
#zomato
Cleaning dataset
sum(is.na(zomato))
## [1] 0
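#Inference: The dataset has no NA values.
#Note: is.na() does not count empty strings or zero placeholders, so this check alone does not show that every value is valid.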
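The NA count above only covers explicit `NA` values. As a minimal sketch of a broader check, the code below also counts empty strings and zeros per column. It runs on a small made-up data frame (`toy` and its values are illustrative assumptions, not rows from the file); the same two `sapply()` calls could be applied to `zomato`.

```r
# Toy data frame standing in for zomato (values are made up for illustration)
toy <- data.frame(
  Cuisines = c("Japanese", "", "Cafe"),
  Aggregate.rating = c(4.8, 0, 4.0),
  Votes = c(314, 0, 591),
  stringsAsFactors = FALSE
)

# Explicit NA values, as in the original check
na_total <- sum(is.na(toy))

# Empty or whitespace-only strings per character column; is.na() does not flag these
empty_per_col <- sapply(toy, function(col) {
  if (is.character(col)) sum(trimws(col) == "") else 0L
})

# Zeros per numeric column; a zero may be a placeholder rather than a real value
zero_per_col <- sapply(toy, function(col) {
  if (is.numeric(col)) sum(col == 0) else 0L
})

print(na_total)
print(empty_per_col)
print(zero_per_col)
```

Whether a zero is a genuine value or a placeholder depends on the column, so these counts are a prompt for inspection rather than a rule for deleting rows.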
Describing the dataset.
View(zomato)
str(zomato)
## 'data.frame': 9551 obs. of 21 variables:
## $ Restaurant.ID : int 6317637 6304287 6300002 6318506 6314302 18189371 6300781 6301290 6300010 6314987 ...
## $ Restaurant.Name : chr "Le Petit Souffle" "Izakaya Kikufuji" "Heat - Edsa Shangri-La" "Ooma" ...
## $ Country.Code : int 162 162 162 162 162 162 162 162 162 162 ...
## $ City : chr "Makati City" "Makati City" "Mandaluyong City" "Mandaluyong City" ...
## $ Address : chr "Third Floor, Century City Mall, Kalayaan Avenue, Poblacion, Makati City" "Little Tokyo, 2277 Chino Roces Avenue, Legaspi Village, Makati City" "Edsa Shangri-La, 1 Garden Way, Ortigas, Mandaluyong City" "Third Floor, Mega Fashion Hall, SM Megamall, Ortigas, Mandaluyong City" ...
## $ Locality : chr "Century City Mall, Poblacion, Makati City" "Little Tokyo, Legaspi Village, Makati City" "Edsa Shangri-La, Ortigas, Mandaluyong City" "SM Megamall, Ortigas, Mandaluyong City" ...
## $ Locality.Verbose : chr "Century City Mall, Poblacion, Makati City, Makati City" "Little Tokyo, Legaspi Village, Makati City, Makati City" "Edsa Shangri-La, Ortigas, Mandaluyong City, Mandaluyong City" "SM Megamall, Ortigas, Mandaluyong City, Mandaluyong City" ...
## $ Longitude : num 121 121 121 121 121 ...
## $ Latitude : num 14.6 14.6 14.6 14.6 14.6 ...
## $ Cuisines : chr "French, Japanese, Desserts" "Japanese" "Seafood, Asian, Filipino, Indian" "Japanese, Sushi" ...
## $ Average.Cost.for.two: int 1100 1200 4000 1500 1500 1000 2000 2000 6000 1100 ...
## $ Currency : chr "Botswana Pula(P)" "Botswana Pula(P)" "Botswana Pula(P)" "Botswana Pula(P)" ...
## $ Has.Table.booking : chr "Yes" "Yes" "Yes" "No" ...
## $ Has.Online.delivery : chr "No" "No" "No" "No" ...
## $ Is.delivering.now : chr "No" "No" "No" "No" ...
## $ Switch.to.order.menu: chr "No" "No" "No" "No" ...
## $ Price.range : int 3 3 4 4 4 3 4 4 4 3 ...
## $ Aggregate.rating : num 4.8 4.5 4.4 4.9 4.8 4.4 4 4.2 4.9 4.8 ...
## $ Rating.color : chr "Dark Green" "Dark Green" "Green" "Dark Green" ...
## $ Rating.text : chr "Excellent" "Excellent" "Very Good" "Excellent" ...
## $ Votes : int 314 591 270 365 229 336 520 677 621 532 ...
dim(zomato)
## [1] 9551 21
head(zomato)
## Restaurant.ID Restaurant.Name Country.Code City
## 1 6317637 Le Petit Souffle 162 Makati City
## 2 6304287 Izakaya Kikufuji 162 Makati City
## 3 6300002 Heat - Edsa Shangri-La 162 Mandaluyong City
## 4 6318506 Ooma 162 Mandaluyong City
## 5 6314302 Sambo Kojin 162 Mandaluyong City
## 6 18189371 Din Tai Fung 162 Mandaluyong City
## Address
## 1 Third Floor, Century City Mall, Kalayaan Avenue, Poblacion, Makati City
## 2 Little Tokyo, 2277 Chino Roces Avenue, Legaspi Village, Makati City
## 3 Edsa Shangri-La, 1 Garden Way, Ortigas, Mandaluyong City
## 4 Third Floor, Mega Fashion Hall, SM Megamall, Ortigas, Mandaluyong City
## 5 Third Floor, Mega Atrium, SM Megamall, Ortigas, Mandaluyong City
## 6 Ground Floor, Mega Fashion Hall, SM Megamall, Ortigas, Mandaluyong City
## Locality
## 1 Century City Mall, Poblacion, Makati City
## 2 Little Tokyo, Legaspi Village, Makati City
## 3 Edsa Shangri-La, Ortigas, Mandaluyong City
## 4 SM Megamall, Ortigas, Mandaluyong City
## 5 SM Megamall, Ortigas, Mandaluyong City
## 6 SM Megamall, Ortigas, Mandaluyong City
## Locality.Verbose Longitude
## 1 Century City Mall, Poblacion, Makati City, Makati City 121.0275
## 2 Little Tokyo, Legaspi Village, Makati City, Makati City 121.0141
## 3 Edsa Shangri-La, Ortigas, Mandaluyong City, Mandaluyong City 121.0568
## 4 SM Megamall, Ortigas, Mandaluyong City, Mandaluyong City 121.0565
## 5 SM Megamall, Ortigas, Mandaluyong City, Mandaluyong City 121.0575
## 6 SM Megamall, Ortigas, Mandaluyong City, Mandaluyong City 121.0563
## Latitude Cuisines Average.Cost.for.two
## 1 14.56544 French, Japanese, Desserts 1100
## 2 14.55371 Japanese 1200
## 3 14.58140 Seafood, Asian, Filipino, Indian 4000
## 4 14.58532 Japanese, Sushi 1500
## 5 14.58445 Japanese, Korean 1500
## 6 14.58376 Chinese 1000
## Currency Has.Table.booking Has.Online.delivery Is.delivering.now
## 1 Botswana Pula(P) Yes No No
## 2 Botswana Pula(P) Yes No No
## 3 Botswana Pula(P) Yes No No
## 4 Botswana Pula(P) No No No
## 5 Botswana Pula(P) Yes No No
## 6 Botswana Pula(P) No No No
## Switch.to.order.menu Price.range Aggregate.rating Rating.color Rating.text
## 1 No 3 4.8 Dark Green Excellent
## 2 No 3 4.5 Dark Green Excellent
## 3 No 4 4.4 Green Very Good
## 4 No 4 4.9 Dark Green Excellent
## 5 No 4 4.8 Dark Green Excellent
## 6 No 3 4.4 Green Very Good
## Votes
## 1 314
## 2 591
## 3 270
## 4 365
## 5 229
## 6 336
tail(zomato)
## Restaurant.ID Restaurant.Name Country.Code City
## 9546 5915054 Baltazar 208 \xdb\xc1stanbul
## 9547 5915730 NamlÛ± Gurme 208 \xdb\xc1stanbul
## 9548 5908749 Ceviz A\xdb\xf4acÛ± 208 \xdb\xc1stanbul
## 9549 5915807 Huqqa 208 \xdb\xc1stanbul
## 9550 5916112 A\x81\xf4\x81\xf4k Kahve 208 \xdb\xc1stanbul
## 9551 5927402 Walter's Coffee Roastery 208 \xdb\xc1stanbul
## Address
## 9546 Kemanke\x81\xf4 Karamustafa Pa\x81\xf4a Mahallesi, KÛ±lÛ±\xed_ Ali Pa\x81\xf4a Mescidi Sokak, No 12/A, Beyo\xdb\xf4lu, \xdb\xc1stanbul
## 9547 Kemanke\x81\xf4 Karamustafa Pa\x81\xf4a Mahallesi, RÛ±htÛ±m Caddesi, No 1/1, KatlÛ± Otopark AltÛ±, Beyo\xdb\xf4lu, \xdb\xc1stanbul
## 9548 Ko\x81\xf4uyolu Mahallesi, Muhittin \xed\xecst\xed_nda\xdb\xf4 Caddesi, No 85, KadÛ±k\xed_y, \xdb\xc1stanbul
## 9549 Kuru\xed_e\x81\xf4me Mahallesi, Muallim Naci Caddesi, No 56, Be\x81\xf4ikta\x81\xf4, \xdb\xc1stanbul
## 9550 Kuru\xed_e\x81\xf4me Mahallesi, Muallim Naci Caddesi, No 64/B, Be\x81\xf4ikta\x81\xf4, \xdb\xc1stanbul
## 9551 Cafea\xdb\xf4a Mahallesi, BademaltÛ± Sokak, No 21/B, KadÛ±k\xed_y, \xdb\xc1stanbul
## Locality Locality.Verbose Longitude
## 9546 Karak\xed_y Karak\xed_y, \xdb\xc1stanbul 28.98110
## 9547 Karak\xed_y Karak\xed_y, \xdb\xc1stanbul 28.97739
## 9548 Ko\x81\xf4uyolu Ko\x81\xf4uyolu, \xdb\xc1stanbul 29.04130
## 9549 Kuru\xed_e\x81\xf4me Kuru\xed_e\x81\xf4me, \xdb\xc1stanbul 29.03464
## 9550 Kuru\xed_e\x81\xf4me Kuru\xed_e\x81\xf4me, \xdb\xc1stanbul 29.03602
## 9551 Moda Moda, \xdb\xc1stanbul 29.02602
## Latitude Cuisines Average.Cost.for.two
## 9546 41.02578 Burger, Izgara 90
## 9547 41.02279 Turkish 80
## 9548 41.00985 World Cuisine, Patisserie, Cafe 105
## 9549 41.05582 Italian, World Cuisine 170
## 9550 41.05798 Restaurant Cafe 120
## 9551 40.98478 Cafe 55
## Currency Has.Table.booking Has.Online.delivery Is.delivering.now
## 9546 Turkish Lira(TL) No No No
## 9547 Turkish Lira(TL) No No No
## 9548 Turkish Lira(TL) No No No
## 9549 Turkish Lira(TL) No No No
## 9550 Turkish Lira(TL) No No No
## 9551 Turkish Lira(TL) No No No
## Switch.to.order.menu Price.range Aggregate.rating Rating.color Rating.text
## 9546 No 3 4.3 Green Very Good
## 9547 No 3 4.1 Green Very Good
## 9548 No 3 4.2 Green Very Good
## 9549 No 4 3.7 Yellow Good
## 9550 No 4 4.0 Green Very Good
## 9551 No 2 4.0 Green Very Good
## Votes
## 9546 870
## 9547 788
## 9548 1034
## 9549 661
## 9550 901
## 9551 591
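The `tail()` output shows byte escapes such as `\xdb\xc1` in the Turkish names, which suggests the file is not valid UTF-8. The sketch below uses base R's `validUTF8()` to flag such values before any text processing. The vector `city` is a toy example that reuses the byte sequence printed above; the actual source encoding of the file is not established here, so this only detects the problem and does not attempt a conversion.

```r
# Toy vector; the second element reuses the byte sequence shown in the tail() output
city <- c("Makati City", "\xdb\xc1stanbul", "New Delhi")

# validUTF8() returns TRUE where the bytes form valid UTF-8
bad <- !validUTF8(city)

# Positions of the values that are not valid UTF-8
print(which(bad))

# Number of affected values; on the real data this could be run on each character column
print(sum(bad))
```

Flagged rows can then be reviewed or re-read with an explicit `fileEncoding` once the correct encoding is known.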
Summarizing the dataset.
summary(zomato)
## Restaurant.ID Restaurant.Name Country.Code City
## Min. : 53 Length:9551 Min. : 1.00 Length:9551
## 1st Qu.: 301962 Class :character 1st Qu.: 1.00 Class :character
## Median : 6004089 Mode :character Median : 1.00 Mode :character
## Mean : 9051128 Mean : 18.37
## 3rd Qu.:18352292 3rd Qu.: 1.00
## Max. :18500652 Max. :216.00
## Address Locality Locality.Verbose Longitude
## Length:9551 Length:9551 Length:9551 Min. :-157.95
## Class :character Class :character Class :character 1st Qu.: 77.08
## Mode :character Mode :character Mode :character Median : 77.19
## Mean : 64.13
## 3rd Qu.: 77.28
## Max. : 174.83
## Latitude Cuisines Average.Cost.for.two Currency
## Min. :-41.33 Length:9551 Min. : 0 Length:9551
## 1st Qu.: 28.48 Class :character 1st Qu.: 250 Class :character
## Median : 28.57 Mode :character Median : 400 Mode :character
## Mean : 25.85 Mean : 1199
## 3rd Qu.: 28.64 3rd Qu.: 700
## Max. : 55.98 Max. :800000
## Has.Table.booking Has.Online.delivery Is.delivering.now Switch.to.order.menu
## Length:9551 Length:9551 Length:9551 Length:9551
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## Price.range Aggregate.rating Rating.color Rating.text
## Min. :1.000 Min. :0.000 Length:9551 Length:9551
## 1st Qu.:1.000 1st Qu.:2.500 Class :character Class :character
## Median :2.000 Median :3.200 Mode :character Mode :character
## Mean :1.805 Mean :2.666
## 3rd Qu.:2.000 3rd Qu.:3.700
## Max. :4.000 Max. :4.900
## Votes
## Min. : 0.0
## 1st Qu.: 5.0
## Median : 31.0
## Mean : 156.9
## 3rd Qu.: 131.0
## Max. :10934.0
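#Inferences
#The mean of Average.Cost.for.two is 1199, but the column mixes several currencies (the maximum is 800000), so this pooled mean is not a meaningful price; the median is 400.
#The mean price range is 1.805 on a scale of 1 to 4.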
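Because Average.Cost.for.two is recorded in local currency, a per-currency summary is easier to interpret than a pooled one. The sketch below shows the idea with base R's `aggregate()`. The data frame `toy` is an assumption built for illustration from the six `head()` and three of the `tail()` rows shown earlier; on the full data the same call would take `data = zomato`.

```r
# Toy subset: costs and currency labels copied from the head() and tail() output
toy <- data.frame(
  Currency = c(rep("Botswana Pula(P)", 3), rep("Turkish Lira(TL)", 3)),
  Average.Cost.for.two = c(1100, 1200, 4000, 90, 80, 105),
  stringsAsFactors = FALSE
)

# Pooled mean across currencies: mixes units, so it is hard to interpret
pooled_mean <- mean(toy$Average.Cost.for.two)

# Median cost within each currency: values stay in a single unit per row
by_currency <- aggregate(Average.Cost.for.two ~ Currency, data = toy, FUN = median)

print(pooled_mean)
print(by_currency)
```

The median is used here because a few very expensive restaurants can pull the mean upwards within a currency as well.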
The most used Currency.
#zomato$City
#Removing every character that is not an ASCII letter or digit (this also removes spaces and brackets)
zomato$Currency <- gsub("[^a-zA-Z0-9]", "", zomato$Currency, perl=TRUE)
#Bar chart of the number of restaurants per currency; columns are referred to by name inside aes()
ggplot(zomato, aes(x=Currency, fill=Currency))+geom_bar()
#Inference:
#Indian Rupees (Rs) is the most common currency among the restaurants listed on Zomato
The city with the most restaurants listed on Zomato
#Removing every character that is not an ASCII letter or digit (this also removes spaces)
zomato$City <- gsub("[^a-zA-Z0-9]", "", zomato$City, perl=TRUE)
#Bar chart of the number of restaurants per city; columns are referred to by name inside aes()
ggplot(zomato, aes(x=City, fill=City))+geom_bar()
freq1 <- table(zomato$City)
freq1
##
## AbuDhabi Agra Ahmedabad Albany
## 20 20 21 20
## Allahabad Amritsar Ankara Armidale
## 20 21 20 1
## Athens Auckland Augusta Aurangabad
## 20 20 20 20
## Balingup Bandung Bangalore Beechworth
## 1 1 20 1
## Bhopal Bhubaneshwar Birmingham Bogor
## 20 21 20 2
## Boise Brasilia CapeTown CedarRapidsIowaCity
## 20 20 20 20
## Chandigarh ChathamKent Chennai Clatskanie
## 18 1 20 1
## Cochrane Coimbatore Colombo Columbus
## 1 20 20 20
## Consort Dalton Davenport dbc1stanbul
## 1 20 20 14
## Dehradun DesMoines DickyBeach Doha
## 20 20 1 20
## Dubai Dubuque EastBallina Edinburgh
## 20 20 1 20
## Faridabad Fernley Flaxton Forrest
## 251 1 1 1
## Gainesville Ghaziabad Goa Gurgaon
## 20 25 20 1118
## Guwahati HepburnSprings Huskisson Hyderabad
## 21 2 1 18
## Indore InnerCity Inverloch Jaipur
## 20 2 1 20
## Jakarta Johannesburg Kanpur Kochi
## 16 6 20 20
## Kolkata LakesEntrance Lakeview Lincoln
## 20 1 1 1
## London Lorn Lucknow Ludhiana
## 20 1 21 20
## Macedon Macon MakatiCity Manchester
## 1 20 2 20
## MandaluyongCity Mangalore Mayfield McMillan
## 4 20 1 1
## MiddletonBeach Miller Mohali Monroe
## 1 1 1 1
## Montville Mumbai Mysore Nagpur
## 1 20 20 20
## Nashik NewDelhi Noida OjoCaliente
## 20 5473 1080 1
## Orlando PalmCove Panchkula PasayCity
## 20 1 1 3
## PasigCity Patna Paynesville Penola
## 3 20 1 1
## Pensacola PhillipIsland Pocatello Potrero
## 20 1 20 1
## Pretoria Princeton Puducherry Pune
## 20 1 20 20
## QuezonCity Ranchi Randburg RestofHawaii
## 1 20 1 20
## RiodeJaneiro Sandton SanJuanCity SantaRosa
## 20 11 2 2
## Savannah Secunderabad Sharjah Singapore
## 20 2 20 20
## SioPaulo SiouxCity Surat TagaytayCity
## 20 20 20 1
## TaguigCity TampaBay Tangerang Tanunda
## 4 20 2 1
## TrenthamEast Vadodara Valdosta Varanasi
## 1 20 20 20
## Vernonia VictorHarbor VinelandStation Vizag
## 1 1 1 20
## Waterloo Weirton WellingtonCity WinchesterBay
## 20 1 20 1
## Yorkton
## 1
barplot(freq1, xlab="City", ylab="Frequency")
freq1[which.max(freq1)]
## NewDelhi
## 5473
freq1[which.min(freq1)]
## Armidale
## 1
#Inference:
#New Delhi has the most restaurants listed on Zomato (5473)
#Armidale has only 1 listing; many other cities also have 1, and which.min() returns only the first of them
boxplot(freq1)
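`which.max()` and `which.min()` return only the first matching position, so ties are hidden. As a small sketch, the code below sorts a frequency table to get the top cities and lists every city that shares the minimum count. The vector `city` is a made-up example; with the real data `freq` would be `table(zomato$City)`.

```r
# Toy city column (counts are made up for illustration)
city <- c("NewDelhi", "NewDelhi", "NewDelhi", "Gurgaon", "Gurgaon", "Armidale", "Balingup")

# Frequency of each city, as in the analysis above
freq <- table(city)

# Cities ordered from most to fewest listings
top <- sort(freq, decreasing = TRUE)
print(head(top, 3))

# All cities tied for the lowest count, not just the first one
least <- names(freq)[as.vector(freq) == min(freq)]
print(least)

# For comparison: which.min() reports a single city even when several are tied
print(names(which.min(freq)))
```

On the full table this makes it clear that the minimum of 1 is shared by many cities rather than being specific to one.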
Plotting the longitude and latitude to visualize the various restaurants’ locations collected in the dataset.
#library(ggmap)
ggplot(zomato, aes(x=Longitude, y=Latitude))+geom_point()
#install.packages("mapview")
#install.packages("tidyverse")
#install.packages("sf")
#install.packages("terra")
#install.packages("remotes")
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.2
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ tibble 3.1.8 ✔ purrr 0.3.4
## ✔ tidyr 1.2.1 ✔ stringr 1.4.0
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(mapview)
## Warning: package 'mapview' was built under R version 4.2.2
library(terra)
## terra 1.6.17
##
## Attaching package: 'terra'
##
## The following object is masked from 'package:tidyr':
##
## extract
library(remotes)
## Warning: package 'remotes' was built under R version 4.2.2
#remotes::install_github("r-spatial/mapview")
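#EPSG:4326 (WGS 84) is the usual CRS for worldwide longitude/latitude in degrees; EPSG:4269 (NAD83) is a North American datum
mapview(zomato, xcol="Longitude", ycol="Latitude", crs = 4326, grid=FALSE)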